stylistic feature
Missing the human touch? A computational stylometry analysis of GPT-4 translations of online Chinese literature
Yao, Xiaofang, Kang, Yong-Bin, McCosker, Anthony
Existing research suggests that machine translations of literary texts remain unsatisfactory. Such quality assessment often relies on automated metrics and subjective human ratings, with little attention to the stylistic features of machine translation. Empirical evidence is also scant on whether the advent of AI will transform the literary translation landscape, with implications for other critical domains for translation, such as the creative industries more broadly. This pioneering study investigates the stylistic features of AI translations, specifically examining GPT-4's performance against human translations in a Chinese online literature task. Our computational stylometry analysis reveals that GPT-4 translations closely mirror human translations in lexical, syntactic and content features. As such, AI translations can in fact replicate the 'human touch' in literary translation style. The study provides critical insights into the implications of AI for literary translation in the posthuman era, where the line between machine and human translations may become increasingly blurry.
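To make the comparison concrete, here is a minimal stylometric-profiling sketch in the spirit of the study. The feature set (type-token ratio, mean sentence length, function-word rate) and the two toy translation snippets are illustrative assumptions, not the authors' actual pipeline.

```python
# A minimal stylometry sketch: compare coarse lexical/syntactic features
# of a human translation against a machine translation. Features and
# example texts are illustrative assumptions.
import re
from statistics import mean

FUNCTION_WORDS = {"the", "of", "and", "to", "a", "in", "that", "it", "was", "he"}

def stylometric_profile(text: str) -> dict:
    """Compute a few coarse lexical/syntactic style features."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_sentence_len": mean(len(s.split()) for s in sentences),
        "function_word_rate": sum(t in FUNCTION_WORDS for t in tokens) / len(tokens),
    }

human_tr = "The moon hung low over the river. He waited, and said nothing."
gpt4_tr = "The moon hung low above the river. He waited and said nothing."

for name, text in [("human", human_tr), ("gpt-4", gpt4_tr)]:
    print(name, stylometric_profile(text))
```

Near-identical profiles across the two sets would support the paper's claim that GPT-4 translations mirror human translations in style.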
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Netherlands > South Holland > Leiden (0.04)
- Oceania (0.04)
- (7 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts
Rezaei, Mosab, Moghadam, Mina Rajaei, Shaikh, Abdul Rahman, Alhoori, Hamed, Freedman, Reva
Recent advances in large language models have created new opportunities for stylometry, the study of writing styles and authorship. Two challenges, however, remain central: training generative models when no paired data exist, and evaluating stylistic text without relying only on human judgment. In this work, we present a framework for both generating and evaluating sentences in the style of 19th-century novelists. Large language models are fine-tuned with minimal, single-token prompts to produce text in the voices of authors such as Dickens, Austen, Twain, Alcott, and Melville. To assess these generative models, we employ a transformer-based detector trained on authentic sentences, using it both as a classifier and as a tool for stylistic explanation. We complement this with syntactic comparisons and explainable AI methods, including attention-based and gradient-based analyses, to identify the linguistic cues that drive stylistic imitation. Our findings show that the generated text reflects the authors' distinctive patterns and that AI-based evaluation offers a reliable alternative to human assessment. All artifacts of this work are published online.

The ability to recognize and reproduce an author's writing style has long fascinated both literary scholars and computer scientists. Stylometry, the quantitative study of writing style, rests on the idea that every author leaves behind unconscious patterns in vocabulary, syntax, and rhythm [2, 3]. These patterns have been analyzed for centuries in questions of disputed authorship, the study of literary traditions, and more recently in applications such as security and forensics [4].
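A hedged sketch of the single-token prompt idea follows: each author is mapped to one new special token that conditions a causal LM. The base model (gpt2) and training details are illustrative assumptions, not the paper's exact setup.

```python
# Register one control token per author, then prefix training sentences
# with it so the model learns to associate the token with the style.
from transformers import AutoModelForCausalLM, AutoTokenizer

AUTHOR_TOKENS = {"dickens": "<DICKENS>", "austen": "<AUSTEN>", "twain": "<TWAIN>"}

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Add the author tokens and grow the embedding matrix to match.
tokenizer.add_special_tokens(
    {"additional_special_tokens": list(AUTHOR_TOKENS.values())}
)
model.resize_token_embeddings(len(tokenizer))

def make_example(author: str, sentence: str) -> str:
    """Prefix a training sentence with its author's single-token prompt."""
    return f"{AUTHOR_TOKENS[author]} {sentence}"

print(make_example("austen", "It is a truth universally acknowledged..."))
# Fine-tuning on such pairs (e.g., with transformers' Trainer) would then
# let generation be steered by prepending the author token alone.
```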
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Mississippi (0.04)
- North America > United States > Illinois (0.04)
- Media > Publishing (0.61)
- Education (0.46)
SLIM-LLMs: Modeling of Style-Sensory Language Relationships Through Low-Dimensional Representations
Khalid, Osama, Srivastava, Sanvesh, Srinivasan, Padmini
Sensorial language, the language connected to our senses (vision, sound, touch, taste, smell, and interoception), plays a fundamental role in how we communicate experiences and perceptions. We explore the relationship between sensorial language and traditional stylistic features, like those measured by LIWC, using a novel Reduced-Rank Ridge Regression (R4) approach. We demonstrate that low-dimensional latent representations of LIWC features (r = 24) effectively capture stylistic information for sensorial language prediction compared to the full feature set (r = 74). We introduce Stylometrically Lean Interpretable Models (SLIM-LLMs), which model non-linear relationships between these style dimensions. Evaluated across five genres, SLIM-LLMs with low-rank LIWC features match the performance of full-scale language models while reducing parameters by up to 80%.
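The rank-reduction step can be made concrete with a small sketch. This assumes plain SVD truncation of the multi-output ridge fit; the paper's exact R4 estimator may differ in detail, and the data here are random stand-ins for LIWC features and sensorial-language targets.

```python
# Reduced-rank ridge: fit multi-output ridge, then project the coefficient
# matrix onto the rank-r subspace spanned by the top right singular
# vectors of the fitted values.
import numpy as np

def reduced_rank_ridge(X, Y, lam=1.0, rank=24):
    """Multi-output ridge regression with a rank-r projection."""
    n, p = X.shape
    B = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)  # p x q ridge fit
    U, s, Vt = np.linalg.svd(X @ B, full_matrices=False)     # SVD of fitted values
    P = Vt[:rank].T @ Vt[:rank]                               # rank-r projector
    return B @ P

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 74))   # e.g., 74 LIWC-style features
Y = rng.normal(size=(200, 6))    # e.g., 6 sensorial-language targets
B_rr = reduced_rank_ridge(X, Y, lam=10.0, rank=5)
print(B_rr.shape)  # (74, 6), with column space of rank at most 5
```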
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Merced County > Merced (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.59)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)
ArchiLense: A Framework for Quantitative Analysis of Architectural Styles Based on Vision Large Language Models
Zhong, Jing, Yin, Jun, Li, Peilin, Zeng, Pengyu, Zang, Miao, Luo, Ran, Lu, Shuai
Architectural cultures across regions are characterized by stylistic diversity, shaped by historical, social, and technological contexts in addition to geographical conditions. Understanding architectural styles requires the ability to describe and analyze the stylistic features of different architects from various regions through visual observations of architectural imagery. However, traditional studies of architectural culture have largely relied on subjective expert interpretations and historical literature reviews, often suffering from regional biases and limited explanatory scope. To address these challenges, this study proposes three core contributions: (1) We construct a professional architectural style dataset named ArchDiffBench, which comprises 1,765 high-quality architectural images and their corresponding style annotations, collected from different regions and historical periods. (2) We propose ArchiLense, an analytical framework grounded in Vision-Language Models and constructed using the ArchDiffBench dataset. By integrating advanced computer vision techniques, deep learning, and machine learning algorithms, ArchiLense enables automatic recognition, comparison, and precise classification of architectural imagery, producing descriptive language outputs that articulate stylistic differences. (3) Extensive evaluations show that ArchiLense achieves strong performance in architectural style recognition, with a 92.4% consistency rate with expert annotations and 84.5% classification accuracy, effectively capturing stylistic distinctions across images. The proposed approach transcends the subjectivity inherent in traditional analyses and offers a more objective and accurate perspective for comparative studies of architectural culture.
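ArchiLense's internals are not shown here, so the following is only a hedged zero-shot sketch of the general vision-language approach, using an off-the-shelf CLIP model, a placeholder image path, and made-up style labels.

```python
# Zero-shot architectural style recognition with CLIP: score an image
# against textual style labels and report a softmax over the candidates.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

STYLES = ["Gothic architecture", "Baroque architecture",
          "Bauhaus architecture", "Brutalist architecture"]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("facade.jpg")  # placeholder path
inputs = processor(text=STYLES, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-text similarity scores
probs = logits.softmax(dim=-1).squeeze()
for style, p in zip(STYLES, probs.tolist()):
    print(f"{style}: {p:.2f}")
```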
KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning
Sui, Peiqi, Rodriguez, Juan Diego, Laban, Philippe, Murphy, Dean, Dexter, Joseph P., So, Richard Jean, Baker, Samuel, Chaudhuri, Pramit
Each year, tens of millions of essays are written and graded in college-level English courses. Students are asked to analyze literary and cultural texts through a process known as close reading, in which they gather textual details to formulate evidence-based arguments. Despite being viewed as a basis for critical thinking and widely adopted as a required element of university coursework, close reading has never been evaluated on large language models (LLMs), and multi-discipline benchmarks like MMLU do not include literature as a subject. To fill this gap, we present KRISTEVA, the first close reading benchmark for evaluating interpretive reasoning, consisting of 1331 multiple-choice questions adapted from classroom data. With KRISTEVA, we propose three progressively more difficult sets of tasks to approximate different elements of the close reading process, which we use to test how well LLMs may seem to understand and reason about literary works: 1) extracting stylistic features, 2) retrieving relevant contextual information from parametric knowledge, and 3) multi-hop reasoning between style and external contexts. Our baseline results find that, while state-of-the-art LLMs possess some college-level close reading competency (accuracy 49.7% - 69.7%), their performances still trail those of experienced human evaluators on 10 out of our 11 tasks.
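The evaluation loop such a benchmark implies is simple to sketch. The question, options, and `ask_model` stub below are illustrative placeholders; KRISTEVA's real items and prompting protocol may differ.

```python
# Minimal multiple-choice evaluation harness: ask a model for an option
# letter per item and score against the gold answers.
def ask_model(question: str, options: dict) -> str:
    """Stand-in for an LLM call; should return one option letter."""
    return "B"  # placeholder prediction

items = [
    {"question": "Which stylistic device dominates the excerpt?",
     "options": {"A": "anaphora", "B": "free indirect discourse",
                 "C": "zeugma", "D": "apostrophe"},
     "answer": "B"},
]

correct = sum(ask_model(it["question"], it["options"]) == it["answer"]
              for it in items)
print(f"accuracy: {correct / len(items):.1%}")
```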
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (14 more...)
- Instructional Material (1.00)
- Research Report > New Finding (0.87)
What's in a prompt? Language models encode literary style in prompt embeddings
Sarfati, Raphaël, Moller, Haley, Liu, Toni J. B., Boullé, Nicolas, Earls, Christopher
Large language models use high-dimensional latent spaces to encode and process textual information. Much work has investigated how the conceptual content of words translates into geometrical relationships between their vector representations. Fewer studies analyze how the cumulative information of an entire prompt becomes condensed into individual embeddings under the action of transformer layers. We use literary pieces to show that information about intangible, rather than factual, aspects of the prompt is contained in deep representations. We observe that short excerpts (10 - 100 tokens) from different novels separate in the latent space independently of which next-token prediction they converge towards. Ensembles of excerpts from books by the same author are much more entangled than ensembles across authors, suggesting that embeddings encode stylistic features. This geometry of style may have applications for authorship attribution and literary analysis, but most importantly it reveals the sophistication of the information processing and compression accomplished by language models.
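A hedged sketch of the measurement: embed each short excerpt as the final hidden state of its last token, then compare within-author against across-author similarity. The model choice (gpt2) and the excerpts are illustrative assumptions, not the paper's configuration.

```python
# Condense a whole prompt into one vector (last token, last layer) and
# compare same-author vs cross-author cosine similarity.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

def prompt_embedding(text: str) -> torch.Tensor:
    """Deep representation of the whole prompt: last token, last layer."""
    ids = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids).last_hidden_state  # (1, seq_len, dim)
    return hidden[0, -1]

austen_1 = prompt_embedding("It is a truth universally acknowledged, that...")
austen_2 = prompt_embedding("She was a woman of mean understanding, little...")
melville = prompt_embedding("Call me Ishmael. Some years ago, never mind...")

cos = torch.nn.functional.cosine_similarity
print("same author:  ", cos(austen_1, austen_2, dim=0).item())
print("cross author: ", cos(austen_1, melville, dim=0).item())
```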
- Europe > United Kingdom > England (0.05)
- North America > United States > Virginia (0.05)
- Europe > France (0.04)
- (5 more...)
Integrated ensemble of BERT- and features-based models for authorship attribution in Japanese literary works
Kanda, Taisei, Jin, Mingzhe, Zaitsu, Wataru
Traditionally, authorship attribution (AA) tasks relied on statistical data analysis and classification based on stylistic features extracted from texts. In recent years, pre-trained language models (PLMs) have attracted significant attention in text classification tasks. However, although they demonstrate excellent performance on large-scale short-text datasets, their effectiveness remains under-explored for small samples, particularly in AA tasks. Additionally, a key challenge is how to effectively leverage PLMs in conjunction with traditional feature-based methods to advance AA research. In this study, we aimed to significantly improve performance using an integrated ensemble of traditional feature-based and modern PLM-based methods on a small-sample AA task. For the experiment, we used two corpora of literary works, classifying 10 authors in each. The results indicate that BERT is effective even for small-sample AA tasks. Both the BERT-based and the traditional classifier ensembles outperformed their respective stand-alone models, and the integrated ensemble approach further improved the scores significantly. For the corpus that was not included in the pre-training data, the integrated ensemble improved the F1 score by approximately 14 points compared to the best-performing single model. Our methodology provides a viable solution for the efficient use of the ever-expanding array of data processing tools in the foreseeable future.
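A minimal sketch of the integrated-ensemble idea: average the class probabilities of a PLM-based classifier and a stylistic-feature classifier. The plain mean used here, the random stand-in features, and the Dirichlet stand-in for BERT outputs are all assumptions; the paper's exact integration strategy may differ.

```python
# Soft-vote two probability matrices: one from a feature-based classifier,
# one standing in for a fine-tuned BERT's softmax outputs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stylistic features (e.g., character n-gram or POS frequencies) per document.
X_feat_train = np.random.rand(80, 50)
y_train = np.arange(80) % 10                 # 10 candidate authors
X_feat_test = np.random.rand(20, 50)

feat_clf = RandomForestClassifier(random_state=0).fit(X_feat_train, y_train)
p_features = feat_clf.predict_proba(X_feat_test)     # (20, 10)

# Stand-in for fine-tuned BERT softmax outputs on the same 20 test documents.
p_bert = np.random.dirichlet(np.ones(10), size=20)   # (20, 10)

p_ensemble = (p_bert + p_features) / 2
pred = p_ensemble.argmax(axis=1)
print(pred[:5])
```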
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)
An exploration of features to improve the generalisability of fake news detection models
Hoy, Nathaniel, Koulouri, Theodora
Fake news poses global risks by influencing elections and spreading misinformation, making detection critical. Existing NLP and supervised Machine Learning methods perform well under cross-validation but struggle to generalise across datasets, even within the same domain. This issue stems from coarsely labelled training data, where articles are labelled based on their publisher, introducing biases that token-based models like TF-IDF and BERT are sensitive to. While Large Language Models (LLMs) offer promise, their application in fake news detection remains limited. This study demonstrates that meaningful features can still be extracted from coarsely labelled data to improve real-world robustness. Stylistic features (lexical, syntactic, and semantic) are explored due to their reduced sensitivity to dataset biases. Additionally, novel social-monetisation features are introduced, capturing economic incentives behind fake news, such as advertisements, external links, and social media elements. The study trains on the coarsely labelled NELA 2020-21 dataset and evaluates using the manually labelled Facebook URLs dataset, a gold standard for generalisability. Results highlight the limitations of token-based models trained on biased data and contribute to the scarce evidence on LLMs like LLaMa in this field. Findings indicate that stylistic and social-monetisation features offer more generalisable predictions than token-based methods and LLMs. Statistical and permutation feature importance analyses further reveal their potential to enhance performance and mitigate dataset biases, providing a path forward for improving fake news detection.
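A hedged sketch of the core move: train on hand-crafted stylistic features rather than tokens, then evaluate on a different dataset to probe generalisability. The three features and the toy articles below are illustrative, not the study's full feature set.

```python
# Style-based fake-news classifier: token-agnostic features should carry
# less publisher bias across datasets than TF-IDF or BERT inputs.
import re
from sklearn.linear_model import LogisticRegression

def style_features(article: str) -> list[float]:
    words = article.split()
    sents = [s for s in re.split(r"[.!?]+", article) if s.strip()]
    return [
        len(words) / max(len(sents), 1),                       # mean sentence length
        sum(w.isupper() for w in words) / max(len(words), 1),  # ALL-CAPS rate
        article.count("!") / max(len(words), 1),               # exclamation density
    ]

train_texts = ["SHOCKING! You won't believe this!", "The committee met on Tuesday."]
train_labels = [1, 0]            # 1 = fake, 0 = reliable (coarse labels)
test_texts = ["BREAKING!!! Miracle cure found!", "Officials released the report."]

clf = LogisticRegression().fit([style_features(t) for t in train_texts], train_labels)
print(clf.predict([style_features(t) for t in test_texts]))
```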
- North America > United States > Missouri (0.04)
- Europe > France > Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)
- Asia > Philippines (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
Neurobiber: Fast and Interpretable Stylistic Feature Extraction
Alkiek, Kenan, Wegmann, Anna, Zhu, Jian, Jurgens, David
Linguistic style is pivotal for understanding how texts convey meaning and fulfill communicative purposes, yet extracting detailed stylistic features at scale remains challenging. We present Neurobiber, a transformer-based system for fast, interpretable style profiling built on Biber's Multidimensional Analysis (MDA). Neurobiber predicts 96 Biber-style features from our open-source BiberPlus library (a Python toolkit that computes stylistic features and provides integrated analytics, e.g., PCA and factor analysis). Despite being up to 56 times faster than existing open source systems, Neurobiber replicates classic MDA insights on the CORE corpus and achieves competitive performance on the PAN 2020 authorship verification task without extensive retraining. Its efficient and interpretable representations readily integrate into downstream NLP pipelines, facilitating large-scale stylometric research, forensic analysis, and real-time text monitoring. All components are made publicly available.
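The Neurobiber/BiberPlus APIs are not shown here, so the following hand-rolls two classic Biber-style features with crude regex heuristics, only to make the notion of "interpretable stylistic features" concrete; it is not the library's implementation.

```python
# Two toy Biber-style features: first-person pronoun rate and a rough
# past-tense proxy, both normalized per 100 tokens.
import re

def biberish_features(text: str) -> dict:
    tokens = re.findall(r"[A-Za-z']+", text)
    n = max(len(tokens), 1)
    first_person = sum(t.lower() in {"i", "me", "my", "we", "our", "us"}
                       for t in tokens)
    # Very rough past-tense proxy: regular -ed forms only.
    past_tense = sum(t.lower().endswith("ed") for t in tokens)
    return {
        "first_person_per_100": 100 * first_person / n,
        "past_tense_per_100": 100 * past_tense / n,
    }

print(biberish_features("We walked home and I laughed at my own joke."))
```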
- Europe > United Kingdom (0.14)
- North America > United States > Oregon (0.14)
- North America > Canada > British Columbia (0.14)
- (2 more...)
StAyaL | Multilingual Style Transfer
Thakrar, Karishma, Lawrence, Katrina, Howard, Kyle
Stylistic text generation plays a vital role in enhancing communication by reflecting the nuances of individual expression. This paper presents a novel approach for generating text in a specific speaker's style across different languages. We show that by leveraging only 100 lines of text, an individual's unique style can be captured as a high-dimensional embedding, which can be used for both text generation and stylistic translation. This methodology breaks down the language barrier by transferring the style of a speaker between languages. The paper is structured into three main phases: augmenting the speaker's data with stylistically consistent external sources, separating style from content using machine learning and deep learning techniques, and generating an abstract style profile by mean pooling the learned embeddings. The proposed approach is shown to be topic-agnostic, with test accuracy and F1 scores of 74.9% and 0.75, respectively. The results demonstrate the potential of the style profile for multilingual communication, paving the way for further applications in personalized content generation and cross-linguistic stylistic transfer.
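A hedged sketch of the mean-pooling phase described above: pool per-line embeddings of a speaker's text into one style profile, then compare profiles by cosine similarity. The off-the-shelf embedding model is an assumption; the paper learns style-specific embeddings after separating style from content.

```python
# Build an abstract style profile by mean pooling line embeddings, then
# attribute a probe line to the nearer profile via cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def style_profile(lines: list[str]) -> np.ndarray:
    """Abstract style profile: mean pooling of per-line embeddings."""
    return model.encode(lines).mean(axis=0)

speaker_a = ["Well, I reckon that'll do.", "Ain't no sense in hurrying."]
speaker_b = ["The quarterly figures indicate growth.", "Kindly review the memo."]
probe     = ["I reckon we ain't done yet."]

profile_a, profile_b = style_profile(speaker_a), style_profile(speaker_b)
p = style_profile(probe)
sim = lambda u, v: float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
print("probe vs A:", sim(p, profile_a), " probe vs B:", sim(p, profile_b))
```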